arXiv:1503.05215v1 [stat.AP] 17 Mar 2015


Age-Specific Mortality and Fertility Rates for 
Probabilistic Population Projections 


Hana Ševčíková, University of Washington
Nan Li, United Nations
Vladimíra Kantorová, United Nations
Patrick Gerland, United Nations 
Adrian E. Raftery, University of Washington 


August 8, 2016 



Abstract 

The UN released official probabilistic population projections (PPP) for all countries for the
first time in July 2014. These were obtained by projecting the period total fertility rate
(TFR) and life expectancy at birth ($e_0$) using Bayesian hierarchical models, yielding a large
set of future trajectories of TFR and $e_0$ for all countries and future time periods to 2100,
sampled from their joint predictive distribution. Each trajectory was then converted to
age-specific mortality and fertility rates, and population was projected using the
cohort-component method. This yielded a large set of trajectories of future age- and sex-specific
population counts and vital rates for all countries. In this paper we describe the methodology
used for deriving the age-specific mortality and fertility rates in the 2014 PPP, we
identify limitations of these methods, and we propose several methodological improvements
to overcome them. The methods presented in this paper are implemented in the publicly
available bayesPop R package.

Keywords: Bayesian hierarchical model; Cohort-component method; Life expectancy at 
birth; Markov chain Monte Carlo; Total fertility rate; United Nations; World Population 
Prospects. 
1 Introduction


The United Nations released official probabilistic population projections for all countries for
the first time in July 2014 (Gerland et al., 2014). They were produced by probabilistically
projecting the period total fertility rates (TFR) and life expectancies ($e_0$) for all countries
using Bayesian hierarchical models (Alkema et al., 2011; Raftery et al., 2013). These
probabilistic projections took the form of a large set of trajectories, each of which was sampled
from the joint predictive distribution of TFR and female and male $e_0$ for all countries and
all future time periods to 2100 using Markov chain Monte Carlo (MCMC) methods.¹

For each trajectory, the life expectancies were converted to age- and sex-specific mortality 
rates, and the total fertility rates were converted to age-specific fertility rates. The population 
was then projected forward using the cohort-component method. This yielded a large set 
of trajectories of population by age and sex, and age-specific fertility and mortality rates, 
for all countries and future time periods jointly. These were summarized by predictive 
medians and 80% and 95% prediction intervals for a wide range of population quantities 
of interest, for all countries and a wide range of regional and other aggregates. They were 
published as the UN’s Probabilistic Population Projections (PPP), and are available at 
http://esa.un.org/unpd/ppp. 

This paper focuses on the methods used to convert probabilistic projections of $e_0$ and
TFR to probabilistic projections of age-specific mortality and fertility rates. Some limitations
of the methods used for the 2014 PPP are identified, and several improvements are proposed
to overcome them. The methods presented in this paper are implemented in an open source
R package called bayesPop (Ševčíková & Raftery, 2014; Ševčíková et al., 2014).

The paper is organized as follows. In Section 2 we describe the current method in PPP
for projecting age-specific mortality rates and our proposed improvements. In Section 2.1
we outline the Probabilistic Lee-Carter method used in the 2014 PPP. In the rest of Section
2 we propose several improvements to overcome limitations of this method. These include a
new Coherent Kannisto Method for joint projection of future age-specific mortality rates at
very high ages that avoids unrealistic crossovers between the sexes (Section 2.2), application
of the Coherent Lee-Carter method to avoid crossovers at lower ages (Section 2.3), new
methods for avoiding jump-off bias (Section 2.4), and application of the Rotated Lee-Carter
method to reflect the fact that when mortality rates are low, they tend to decline faster at
older than at younger ages (Section 2.5). In Section 3 we describe the current method in
PPP for projecting age-specific fertility rates and our proposed improvements. We conclude
with a discussion in Section 4.

¹ This general approach applies to countries experiencing normal mortality trends. For countries
that have ever experienced adult HIV prevalence of 2 per cent or more during the period 1980 to
2010, all projected trajectories of life expectancy by sex were adjusted to ensure that the
median trajectory for each country was consistent with the 2012 Revision of the World Population
Prospects deterministic projection, which incorporates the impact of HIV/AIDS on mortality as
well as assumptions about future potential improvements both in the reduction of the epidemic
and in survival due to treatment.


2 Age-Specific Mortality Rates for Probabilistic Population Projections

2.1 Probabilistic Lee-Carter Method 


Our methodology is based on the Lee-Carter model (Lee & Carter, 1992): 


$$\log[m_x(t)] = a_x + b_x k(t) + \varepsilon_x(t), \qquad \varepsilon_x(t) \sim N(0, \sigma_\varepsilon^2),$$


where $m_x(t)$ is the mortality rate for age $x$ and time period $t$. The quantity $a_x$ represents
the baseline pattern of mortality by age over time, and $b_x$ is the average rate of change in
mortality rate by age group for a unit change in the mortality index $k(t)$. The parameter
$k(t)$ is a time-varying index of the overall level of mortality, and $\varepsilon_x(t)$ is the
residual at age $x$ and time $t$. Throughout this paper, log denotes the natural logarithm.

For a given matrix of rates $m_x(t)$ the model is estimated by a least squares method. The
baseline mortality pattern $a_x$ is estimated as the average of $\log[m_x(t)]$ over the past time
periods with observed data. Since the model is underdetermined, $b_x$ is identified by setting
$\sum_x b_x = 1$, where the sum is over all ages or age groups $x$. Also, $k(t)$ is identified by
setting $\sum_{t=1}^{T} k(t) = 0$, where $T$ is the number of past time periods for which data
are available. The estimates are then


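$$\hat{a}_x = \frac{\sum_{t=1}^{T} \log[m_x(t)]}{T}, \tag{1}$$

$$\hat{k}(t) = \sum_x \left\{\log[m_x(t)] - \hat{a}_x\right\}, \tag{2}$$

$$\hat{b}_x = \frac{\sum_{t=1}^{T} \left\{\log[m_x(t)] - \hat{a}_x\right\} \hat{k}(t)}{\sum_{t=1}^{T} \hat{k}(t)^2}. \tag{3}$$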
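As an illustration, the least squares estimates in equations (1)-(3) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the bayesPop implementation: the function name `fit_lee_carter` and the synthetic rates are hypothetical, and the input is assumed to be a complete matrix of positive rates with ages in rows and time periods in columns.

```python
import numpy as np

def fit_lee_carter(m):
    """Least squares Lee-Carter estimates for a (n_ages, T) matrix of rates.

    Hypothetical helper: returns (a, b, k) following equations (1)-(3).
    """
    log_m = np.log(np.asarray(m, dtype=float))
    a = log_m.mean(axis=1)                 # eq. (1): average log rate over time
    centered = log_m - a[:, None]          # log m_x(t) - a_x
    k = centered.sum(axis=0)               # eq. (2): sum over ages
    b = centered @ k / np.sum(k ** 2)      # eq. (3): regression of centered rates on k
    return a, b, k

# Synthetic data that follow the model exactly (sum of b is 1, sum of k is 0).
true_a = np.array([-5.0, -6.0, -5.5, -4.0, -2.5])
true_b = np.array([0.30, 0.25, 0.20, 0.15, 0.10])
true_k = np.array([4.0, 2.0, 0.0, -2.0, -4.0])
m = np.exp(true_a[:, None] + np.outer(true_b, true_k))

a_hat, b_hat, k_hat = fit_lee_carter(m)
```

With noise-free input that satisfies both identifying constraints, the sketch recovers the generating parameters; with real data the estimates absorb the residual term.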


To forecast $m_x(t)$, one needs to project $k(t)$ into the future. To project $k(t)$, the
Lee-Carter method uses a random walk with drift:

$$k(t+1) = k(t) + d, \quad \text{where } d = \frac{1}{T-1}\,[k(T) - k(1)].$$
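The drift projection can be illustrated with a short Python sketch. It assumes that the drift is the average historical change, $d = [k(T) - k(1)]/(T-1)$, and it traces only the central path without the random innovations; the function name `project_k` is hypothetical.

```python
import numpy as np

def project_k(k_hist, n_periods):
    """Central path of a random walk with drift started at the last observed k.

    Hypothetical helper: the drift d is the mean one-period change in k_hist.
    """
    k_hist = np.asarray(k_hist, dtype=float)
    T = len(k_hist)
    d = (k_hist[-1] - k_hist[0]) / (T - 1)
    steps = np.arange(1, n_periods + 1)
    return k_hist[-1] + d * steps, d

k_future, drift = project_k([4.0, 2.0, 0.0, -2.0, -4.0], n_periods=3)
```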


Lee & Miller (2001) proposed replacing the step of projecting $k(t)$ by itself by matching
future $k(t)$ to future projected $e_0(t)$.

Current calculations are done using a highest age or open interval of 85+. For projections
one needs to extend mortality rates to higher ages $x$, usually beyond 100+, because mortality
rates are expected broadly to decline over time in the future, so there will be larger numbers
of people at higher ages. For extending the force of mortality at older age groups, the
Kannisto model provides a robust way to fit available mortality rates from age 80 to 100,
and to extrapolate mortality rates up to age 130 in a way that is consistent with empirical
observations on oldest-old mortality (Thatcher et al., 1998).

The Bayesian probabilistic projections of life expectancy (Raftery et al., 2013, 2014)
provide us with a set of future trajectories of female and male $e_0$, representing a sample from
the joint predictive distribution of future female and male $e_0$ for all countries and all future
time periods. The 2014 PPP used methods for turning a trajectory of future $e_0$ values into
a set of future age-specific mortality rates based on the ideas of Lee & Miller (2001) and Li &
Gerland (2011); see Raftery et al. (2012). They were based on the following algorithm:


Algorithm 1 

Let $t \in \{1, \ldots, T\}$ and $\tau \in \{T+1, \ldots, T_p\}$ denote the observed and projected
time periods, respectively.

1. Using the Kannisto method, extend $m_x(t)$ to higher age groups so that $\max(x) = 130+$
for all $t$.

2. Estimate $a_x$, $k(t)$ and $b_x$ using the extended historical $m_x(t)$ (equations (1)-(3)).

3. For a given $e_0(\tau)$ in each trajectory and given $a_x$ and $b_x$, solve for future $k(\tau)$
numerically using life tables. This yields a nonlinear equation which can be solved using the
bisection method. More details are given in Section 2.6.

4. Compute mortality rates by $\log[m_x(\tau)] = a_x + b_x k(\tau)$ for each trajectory and future
time $\tau$.


Applying these steps to all trajectories of $e_0$ yields a posterior predictive distribution of
$m_x(t)$.

However, this procedure has a number of drawbacks. There is no assurance that the
extension of $m_x(t)$ to higher ages yields mortality rates that are coherent between males and
females. Similarly, the predicted $m_x(\tau)$ can lead to unwanted crossovers between female and
male mortality rates, since they are obtained independently for each sex. In the following
sections, we present solutions to these and other limitations of the simple algorithm above,
and give more details about Step 3.
2.2 Coherent Kannisto Method


A sex-independent extension of the observed mortality rates to higher age categories can 
lead to unrealistic crossovers at higher ages. We propose a modification of the Kannisto 
method that treats male and female mortality rates jointly. In this section, for simplicity we 
omit the time index t. 

The original Kannisto model has the form 


$$m_x = \frac{c\,e^{dx}}{1 + c\,e^{dx}}, \quad \text{or} \quad \mathrm{logit}(m_x) = \log c + dx + \varepsilon_x,$$


where $\varepsilon_x$ is a random perturbation with mean zero. The model is usually estimated
independently for each sex, assuming independence across ages and normality of the
$\varepsilon_x$, using a maximum likelihood method (Thatcher et al., 1998; Wilmoth et al., 2007).
This yields sex-specific parameter estimates $\hat{d}_M$, $\hat{d}_F$, $\hat{c}_M$, $\hat{c}_F$.

We suggest modifying this by forcing the sex-specific parameters $d_M$ and $d_F$ to be equal
(i.e. $d_M = d_F = d$), but still allowing the parameters $c_M$ and $c_F$ to differ between the sexes:

$$\mathrm{logit}(m_x^g) = \log c_g + dx + \varepsilon_x^g, \quad \text{for } g = M, F.$$

This leads to the following model:

$$\mathrm{logit}(m_x^g) = \beta_0 + \beta_1 1_{(g=M)} + \beta_2 x + \varepsilon_x^g,$$

where $1_{(g=M)} = 1$ if $g = M$ and 0 otherwise.

To estimate the $\beta$ parameters, we fit the model to the observed $m_x$ for ages 80-99
by ordinary least-squares regression, which corresponds to maximum likelihood under the
assumptions of independence and normality of the $\varepsilon_x^g$. There are four age groups in
the data used for fitting the model, and thus eight points in total for both sexes. Then,

$$\hat{c}_F = e^{\hat{\beta}_0}, \qquad \hat{c}_M = e^{\hat{\beta}_0 + \hat{\beta}_1}, \qquad \hat{d} = \hat{\beta}_2.$$


Figure 1 shows the resulting $m_x$ for old ages for Brazil and Lithuania in the last observed
time period. From the left panels we see that there are crossovers using the classic Kannisto
method, which is unrealistic. However, male mortality stays above female mortality in the
coherent version, as can be seen in the right panels; this is more realistic.
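The joint fit can be sketched in Python as an ordinary least-squares regression of $\mathrm{logit}(m_x)$ on an intercept, a male indicator and age. This is an illustrative sketch only: the function names are hypothetical, the age $x$ of each five-year group is assumed to be represented by its midpoint, and the input rates are synthetic.

```python
import numpy as np

def logit(p):
    return np.log(p / (1.0 - p))

def kannisto(x, c, d):
    """Kannisto curve m_x = c*exp(d*x) / (1 + c*exp(d*x))."""
    z = c * np.exp(d * np.asarray(x, dtype=float))
    return z / (1.0 + z)

def fit_coherent_kannisto(ages, m_female, m_male):
    """OLS fit with a common slope d and sex-specific levels c_F, c_M."""
    ages = np.asarray(ages, dtype=float)
    n = len(ages)
    y = np.concatenate([logit(np.asarray(m_female, dtype=float)),
                        logit(np.asarray(m_male, dtype=float))])
    # Design matrix columns: intercept, male indicator, age.
    X = np.column_stack([np.ones(2 * n),
                         np.concatenate([np.zeros(n), np.ones(n)]),
                         np.concatenate([ages, ages])])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.exp(beta[0]), np.exp(beta[0] + beta[1]), beta[2]

# Four age groups (80-84, ..., 95-99), represented here by assumed midpoints.
fit_ages = np.array([82.5, 87.5, 92.5, 97.5])
m_f = kannisto(fit_ages, c=1e-5, d=0.10)   # synthetic female rates
m_m = kannisto(fit_ages, c=2e-5, d=0.10)   # synthetic male rates

c_f, c_m, d = fit_coherent_kannisto(fit_ages, m_f, m_m)

# Extrapolate both sexes to the oldest ages with the shared slope.
old_ages = np.arange(102.5, 132.5, 5.0)
ext_f = kannisto(old_ages, c_f, d)
ext_m = kannisto(old_ages, c_m, d)
```

Because both sexes share the slope $d$, the extrapolated logit curves are parallel, so the ordering of the sexes set by $c_M$ and $c_F$ is preserved at every age.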
Figure 1: Mortality rates for male (blue) and female (red) extended using the coherent
Kannisto method (right panels) compared to original Kannisto (left panels) for Brazil and 
Lithuania in the last observed time period (2005-2010). 


2.3 Coherent Lee-Carter Method 


We will adopt an extension of the Lee-Carter method suggested by Li & Lee (2005), the
so-called coherent Lee-Carter method. It takes into account the fact that mortality patterns
for closely related populations are expected to be similar. In our application, these related
populations will be males and females in the same country, since there is no expectation
that the life expectancy will diverge between such groups. Thus, the Lee-Carter method is
extended by two requirements:

$$b_x^M = b_x^F, \qquad k^M(\tau) = k^F(\tau), \tag{4}$$

where $M$ and $F$ denote male and female sex, respectively. This ensures that the rates
of change of the future mortality rates are the same for the two sexes, and thus avoids
crossovers.


2.4 Avoiding Jump-off Bias 

Mortality rates in the last period of the historical data used for estimation (or jump-off
period) are commonly referred to as jump-off rates (Booth et al., 2006). Often there is a
mismatch between fitted rates for the last period $T$ and the actual rates (jump-off bias).
As a result, a discontinuity between the actual rates in the jump-off period and the rates
projected in the first projection period may occur.

A possible solution to avoid jump-off bias is to constrain the model in such a way that
$k(t)$ passes through zero in the jump-off period $T$, and to use $m_x$ only from the last fitting
period to obtain $a_x$ (Lee & Miller, 2001):

$$a_x = \log[m_x(T)] \;\Rightarrow\; k(T) = 0. \tag{5}$$


A disadvantage of this solution is that in cases where the mortality rates are bumpy in 
the jump-off period (i.e. not smooth across ages), this “bumpiness” propagates into the 
future. In general for projections, we suggest using the age-specific mortality rates from the 
last fitting period and smoothing them over age if necessary (e.g. for small populations with 
few deaths in some age groups) while preserving the value for the youngest age group: 

$$a_x = \mathrm{smooth}_x\{\log[m_x(T)]\} \quad \text{with} \quad a_{0-1} = \log[m_{0-1}(T)]. \tag{6}$$

Figure 2 shows the resulting difference in $\log[m_x(\tau)]$ projected to $\tau$ = 2095-2100 for two
countries using the three different methods of computing $a_x$, namely equations (1), (5) and
(6). As can be seen in the case of Bangladesh, the smoothing step removes bumps whereas
the averaging method does not.

Figure 3 shows the impact of the methods on $m_x$ as time series for Bangladesh for three
different age groups. Using the average $m_x$ results in jump-offs for the 5-9 and 95-99 age
groups (blue curve). If the latest raw $m_x$ are used, the jump-offs are eliminated (grey curve).
A smoothed version creates a new jump-off for the age group 75-79 (red curve).

This shows that there is a trade-off between bumpy mortality rates over ages in later
projection years with no jump-offs, and smooth mortality rates with possible jump-offs. Our
solution is to decide on a country-specific basis which method is more appropriate.









Figure 2: Log female mortality rates for Bangladesh and Pakistan in 2095-2100 projected
using three different methods for computing $a_x$: (1) using an average $m_x$ over time (blue
line); (2) using the latest smoothed $m_x$ (red line); and (3) using the latest $m_x$ as it is (grey
line).


2.5 Rotated Lee-Carter Method 


Li et al. (2013) focused on the fact that in more developed regions, once countries have
already reached a high level of life expectancy at birth, the mortality decline decelerates at
younger ages and accelerates at old ages. This change in the pace of mortality decline by
age cannot be captured by the original Lee-Carter method, since this constrains the rate of
change $b_x$ to be constant over time. They proposed instead rotating the $b_x$ over time to a
so-called ultimate $b_x$, denoted by $b_{u,x}$, which is computed as follows.

Let

$$\bar{b}_{15-64} = \frac{1}{10} \sum_{x=15-19}^{60-64} b_x.$$

Then

$$b_{u,x} = \begin{cases} \bar{b}_{15-64} & \text{for } x \in \{0-1, 1-4, 5-9, \ldots, 60-64\}, \\ b_x \cdot b_{u,60-64} / b_{65-69} & \text{for } x \in \{65-69, \ldots, 130+\}, \end{cases} \tag{7}$$

with $b_{u,x}$ scaled to sum to unity over all ages.
Figure 3: Mortality rates for Bangladesh by time for three different age groups. Colors
correspond to the same methods as in Figure 2.


The rotation is dependent on $e_0(\tau)$, and so the resulting $b_x$ also becomes time-dependent.
The rotation finishes at a certain level of life expectancy, denoted by $e_0^u$. Li et al. (2013)
recommend using $e_0^u = 102$. Using the smooth weight function

$$w(\tau) = \left\{ \frac{1}{2} \left[ 1 + \sin\left( \frac{\pi}{2} \left( 2w'(\tau) - 1 \right) \right) \right] \right\}^{1/2} \quad \text{with} \quad w'(\tau) = \frac{e_0(\tau) - 80}{e_0^u - 80},$$

the rotated $b_x$ at time $\tau$, denoted by $B_x(\tau)$, is derived as:

$$B_x(\tau) = \begin{cases} b_x, & e_0(\tau) < 80, \\ [1 - w(\tau)]\, b_x + w(\tau)\, b_{u,x}, & 80 \le e_0(\tau) < e_0^u, \\ b_{u,x}, & e_0(\tau) \ge e_0^u. \end{cases} \tag{8}$$

Figure 4 shows the results for Japan as an example. The original $b_x$ is shown by the black
curve. The ultimate $b_{u,x}$, to be reached at life expectancy of 102, is in red. The remaining
curves show the change over time starting with yellow and continuing towards the red curve.
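The rotation can be sketched in Python as follows. The sketch rests on several assumptions that should be checked against Li et al. (2013): the age groups are ordered 0-1, 1-4, 5-9, ..., 125-129, 130+ (28 groups), the five-year group following 60-64 is used as the reference in equation (7), and the smooth weight is $\{0.5[1 + \sin(\tfrac{\pi}{2}(2w' - 1))]\}^{p}$ with $p = 0.5$. All function names are hypothetical.

```python
import numpy as np

def ultimate_bx(bx):
    """Ultimate b_{u,x} for 28 age groups 0-1, 1-4, 5-9, ..., 130+ (assumed layout)."""
    bx = np.asarray(bx, dtype=float)
    bu = bx.copy()
    mean_15_64 = bx[4:14].mean()                  # groups 15-19 (index 4) to 60-64 (index 13)
    bu[:14] = mean_15_64                          # flat up to age group 60-64
    bu[14:] = bx[14:] * mean_15_64 / bx[14]       # rescale older ages to join continuously
    return bu / bu.sum()                          # scale to sum to unity

def rotation_weight(e0, e0u=102.0, p=0.5):
    """Smooth weight in [0, 1]; the exponent p = 0.5 is an assumption of this sketch."""
    w_lin = np.clip((e0 - 80.0) / (e0u - 80.0), 0.0, 1.0)
    return (0.5 * (1.0 + np.sin(np.pi / 2.0 * (2.0 * w_lin - 1.0)))) ** p

def rotated_bx(bx, bu, e0, e0u=102.0):
    """B_x for a given life expectancy: b_x below 80, b_{u,x} at or above e0u."""
    if e0 < 80.0:
        return np.asarray(bx, dtype=float)
    if e0 >= e0u:
        return np.asarray(bu, dtype=float)
    w = rotation_weight(e0, e0u)
    return (1.0 - w) * np.asarray(bx) + w * np.asarray(bu)

# Synthetic b_x that declines with age, scaled to sum to one.
bx = np.linspace(2.0, 0.5, 28)
bx = bx / bx.sum()
bu = ultimate_bx(bx)
B_mid = rotated_bx(bx, bu, e0=91.0)   # halfway between 80 and 102
```

Since both $b_x$ and $b_{u,x}$ sum to one, every intermediate $B_x(\tau)$ also sums to one, which keeps the interpretation of $k(\tau)$ unchanged during the rotation.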


2.6 Computing Life Tables 

Step 3 in Algorithm 1 calls for matching future $k(\tau)$ to projected $e_0(\tau)$. This is a nonlinear
equation in $k(\tau)$. It is solved by an iterative nonlinear procedure in which for given values
of $a_x$, $b_x$ and $k(\tau)$ a life table is produced, and the resulting life expectancy is computed
and compared with the projected $e_0(\tau)$. We used a bisection method to solve the nonlinear
equation. This is simple and robust and involves relatively few iterations. It would be
possible to use a nonlinear solution method that is more efficient computationally, but the
computational gains would be modest and this could make the method much more complex.

Figure 4: Rotating the parameter $b_x$ over time. The original $b_x$ (black curve) approaches the
ultimate $b_{u,x}$ (red curve) over time starting with yellow and continuing towards the darker
colors.

In the process of computing life tables, the conversion of mortality rates $m_x(\tau)$ to
probabilities of dying $q_x(\tau)$ follows the approach used by the United Nations to compute abridged
life tables. This is computed by the LIFTB function in Mortpak (United Nations, 1988,
2013a), where at a given time point the probability of dying for an individual between age
$x$ and $x + n$ is:

$${}_nq_x = \frac{n \cdot {}_nm_x}{1 + (n - {}_nA_x) \cdot {}_nm_x}, \tag{9}$$


with $n$ being the length of the age interval and ${}_nA_x$ being the average number of years lived
between ages $x$ and $x + n$ by those dying in the interval. With $l_x$ being the number of
survivors at age $x$, we have

$$l_{x+n} = l_x (1 - {}_nq_x), \tag{10}$$

$${}_nd_x = l_x - l_{x+n}, \tag{11}$$

$${}_nL_x = {}_nA_x\, l_x + (n - {}_nA_x)\, l_{x+n}, \tag{12}$$


where ${}_nd_x$ denotes the number of deaths between ages $x$ and $x + n$ and ${}_nL_x$ denotes the
number of person-years lived between ages $x$ and $x + n$. The expectation of life at age $x$ (in
years) $e_x$ is given by

$$e_x = \frac{T_x}{l_x} \quad \text{with} \quad T_x = \sum_{a=x}^{\infty} {}_nL_a,$$

where $T_x$ is the number of person-years lived at age $x$ and older.

For ages 15 and over, the expression for ${}_nA_x$ is derived from the Greville (1943) approach
to calculating age-specific separation factors based on the age pattern of the mortality rates
themselves with:

$${}_nA_x = 2.5 - \frac{25}{12}\,({}_nm_x - k), \quad \text{where } k = \frac{1}{10} \log\left( \frac{{}_nm_{x+5}}{{}_nm_{x-5}} \right).$$

For ages 5 and 10, ${}_nA_x = 2.5$, and for ages under 5, values from the Coale and Demeny West
region relationships are used for ${}_nA_x$ (Coale & Demeny, 1966).²
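The matching of $k(\tau)$ to a projected $e_0(\tau)$ can be illustrated with the following Python sketch. It is deliberately simplified and is not the Mortpak LIFTB routine: it assumes ${}_nA_x = n/2$ for every closed age group instead of the Coale-Demeny and Greville values, closes the life table with $L = l/m$ in the open interval, and uses synthetic $a_x$ and $b_x$. The function names `life_expectancy` and `solve_k` are hypothetical.

```python
import numpy as np

def life_expectancy(mx, widths):
    """e0 from an abridged life table; the last age group is the open interval."""
    mx = np.asarray(mx, dtype=float)
    n = np.asarray(widths, dtype=float)
    ax = n / 2.0                                              # simplifying assumption
    qx = np.minimum(n * mx / (1.0 + (n - ax) * mx), 1.0)      # eq. (9), capped at 1
    lx = np.concatenate(([1.0], np.cumprod(1.0 - qx[:-1])))   # eq. (10), radix 1
    lx_next = np.append(lx[1:], 0.0)
    Lx = ax * lx + (n - ax) * lx_next                         # eq. (12)
    Lx[-1] = lx[-1] / mx[-1]                                  # open interval
    return Lx.sum() / lx[0]

def solve_k(e0_target, a, b, widths, lo=-50.0, hi=50.0, tol=1e-10):
    """Bisection for k such that the life table of exp(a + b*k) has the target e0.

    Assumes all b > 0, so that e0 decreases as k increases.
    """
    def gap(k):
        return life_expectancy(np.exp(a + b * k), widths) - e0_target

    if gap(lo) < 0 or gap(hi) > 0:
        raise ValueError("target e0 is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:      # e0 still too high: mortality must rise, so k must rise
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic inputs: age groups 0-1, 1-4, 5-9, ..., 95-99, 100+.
age_start = np.array([0, 1] + list(range(5, 105, 5)))
widths = np.array([1, 4] + [5] * 20)          # last width is unused (open interval)
a = -9.5 + 0.09 * age_start
a[0] = -3.5                                   # higher infant mortality
b = np.full(len(a), 1.0 / len(a))

k_true = -3.0
e0_true = life_expectancy(np.exp(a + b * k_true), widths)
k_hat = solve_k(e0_true, a, b, widths)
```

Each bisection step halves the bracket, so roughly 40 life tables are needed to locate $k$ to ten decimal places from a bracket of width 100.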


2.7 Summary of Improved Algorithm 

We now summarize the modifications described in the previous sections by proposing an
improved algorithm for deriving the age-specific mortality rates $m_x$ for potential use in
future probabilistic population projections.


Algorithm 2 

As before, let $t \in \{1, \ldots, T\}$ and $\tau \in \{T+1, \ldots, T_p\}$ denote the observed and
projected time periods, respectively. Also, let $g \in \{F, M\}$ be an index to distinguish
sex-specific measures.


1. Using the Coherent Kannisto Method from Section 2.2, extend $m_x(t)$ to higher age
categories with $\max(x) = 130+$ for all $t$.


² The Coale and Demeny West region formulae are used as follows. When ${}_1m_0 \ge 0.107$, then
${}_1A_0 = 0.33$ for males and 0.35 for females; ${}_4A_1 = 1.352$ for males and 1.361 for females.
When ${}_1m_0 < 0.107$, ${}_1A_0 = 0.045 + (2.684 \cdot {}_1m_0)$ for males and
${}_1A_0 = 0.053 + (2.800 \cdot {}_1m_0)$ for females; ${}_4A_1 = 1.651 - (2.816 \cdot {}_1m_0)$ for
males and ${}_4A_1 = 1.522 - (1.518 \cdot {}_1m_0)$ for females.
2. Choose a method to estimate $a_x$, i.e. one of equations (1), (5) or (6), depending on
country specifics.³ Do the estimation for each sex $g$, obtaining $a_x^g$.

3. Estimate $k(t)$ and $b_x$ using the extended historical $m_x(t)$ and equations (2)-(3) for
$g = M, F$ independently, yielding $b_x^g$.

4. Given $b_x^M$ and $b_x^F$ from Step 3, set $b_x = (b_x^M + b_x^F)/2$.

5. Compute the ultimate $b_{u,x}$ as in equation (7).

6. For a combined $e_0(\tau) = [e_0^M(\tau) + e_0^F(\tau)]/2$ in each trajectory, compute $B_x(\tau)$ as
in equation (8).

7. For a given sex-specific $e_0^g(\tau)$ in each trajectory and given $a_x^g$ and $B_x(\tau)$, solve for
future $k_g(\tau)$ numerically using life tables. This yields a nonlinear equation which is solved
using the bisection method, as described in Section 2.6.

8. For each trajectory, time $\tau$ and sex $g$, compute mortality rates by
$\log[m_x^g(\tau)] = a_x^g + B_x(\tau) k_g(\tau)$.

9. Since the previous step does not comply with equation (4) and thus can lead to
crossovers in high ages, an additional constraint is added:

If $e_0^M(\tau) < e_0^F(\tau)$ then

$$m_x^M(\tau) = \max[m_x^F(\tau), m_x^M(\tau)] \quad \text{for } x \ge 100.$$
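The final constraint in Step 9 can be illustrated with a small Python sketch. It assumes the constraint reads as follows: whenever projected male life expectancy is below female life expectancy, male mortality at ages 100 and over is raised to at least the female rate. The function name and the example rates are hypothetical.

```python
import numpy as np

def enforce_no_crossover(m_male, m_female, e0_male, e0_female, age_start, min_age=100):
    """Return male rates with the old-age crossover constraint applied (a sketch)."""
    m_male = np.array(m_male, dtype=float)          # copy, so the input is not modified
    m_female = np.asarray(m_female, dtype=float)
    if e0_male < e0_female:
        old = np.asarray(age_start) >= min_age      # age groups starting at 100 or later
        m_male[old] = np.maximum(m_male[old], m_female[old])
    return m_male

age_start = np.array([95, 100, 105])
male = np.array([0.40, 0.45, 0.60])
female = np.array([0.42, 0.50, 0.55])

adjusted = enforce_no_crossover(male, female, e0_male=80.0, e0_female=85.0,
                                age_start=age_start)
unchanged = enforce_no_crossover(male, female, e0_male=86.0, e0_female=85.0,
                                 age_start=age_start)
```

In this example only the 100-104 group changes: the 95-99 group lies below the age threshold, and the male rate for 105-109 already exceeds the female rate.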


Figure 5 shows the resulting probabilistic projection of $m_x(\tau)$ for the period 2095-2100 for
both sexes in two selected countries. In addition to the marginal distribution for Kazakhstan
in the right panel of Figure 5, its joint distribution for males and females is shown in Figure 6
on a logarithmic scale. Points below the $x = y$ solid line indicate crossovers in the individual
trajectories. It can be seen that only a few trajectories experience crossovers when mortality
is low, i.e. in young ages, suggesting a low (but non-zero) probability for such an event,
while there are no crossovers for high mortality, i.e. in old ages. We observed similar results
for most countries. Figure 7 shows the same joint distribution for selected age groups on a
normal scale.

³ In the bayesPop package this country-specific set of options is controlled through two dummy variables in
the vwBaseYear2012 dataset: (1) whether the most recent estimate of age mortality pattern should be used 
(LatestAgeMortalityPattern) and (2) whether it should be smoothed (SmoothLatestAgeMortalityPattern). 
See vwBaseYear2012 in R. 
Figure 5: Probabilistic projection of age-specific mortality rates for Japan (left panel) and
Kazakhstan (right panel) in the time period 2095-2100. Both plots show the marginal 
distribution for male (blue lines) and female (red lines) where the dashed lines mark the 
80% probability intervals and the solid lines are 20 randomly sampled trajectories (out of 
1000) for each sex. The y-axis is on the logarithmic scale. 


Exceptions 


For about 50 countries, insufficient detailed data about mortality by age and sex are available
between 1950 and 2010 (United Nations, 2013c). Therefore, the age patterns of mortality are
based on model life tables (e.g., Coale-Demeny). For these countries a model $b_x$ associated
with one of the regional model life tables is used (see Table 2, page 18 in Li & Gerland
(2011)).⁴


In addition, for about 40 countries with a generalized HIV/AIDS epidemic, age patterns
of mortality since the 1980s have been affected by the impact of AIDS mortality (especially
before the scaling up of antiretroviral treatment starting in 2005). For these countries the
application of the conventional Lee-Carter approach is inappropriate.⁵ Instead, we introduce
a modification where steps 2-6 in Algorithm 2 are replaced by the following steps:


⁴ In the bayesPop package this country-specific set of options is controlled through two variables in the
vwBaseYear2012 dataset: (1) the type of age mortality pattern used for the estimation period
(AgeMortalityType with the option "Model life tables") and (2) the specific mortality pattern used
(AgeMortalityPattern with options like "CD West").

⁵ In the bayesPop package this specific set of countries is identified through a dummy variable
(WPPAIDS) in the vwBaseYear2012 dataset.
Figure 6: The joint predictive distribution of mortality rates for females and males for
Kazakhstan in 2095-2100. It shows mortality rates from all age groups where age groups are 
distinguished by colors. Both axes are on the logarithmic scale. There are 1000 points per 
age group. 


1. Start with the most recent $a_x$ (affected by the impact of HIV/AIDS on mortality) and
smooth it as in equation (6), obtaining $a_x^s$.

2. Compute an ultimate (or “AIDS-free” target) $a_x$, denoted by $a_x^u$, which is a smoothed
version of the average of historical $\log(m_x)$ up to 1985 (i.e., prior to the start of the impact
caused by HIV/AIDS on mortality), the average being denoted by $a_x^p$:

$$a_x^p = \frac{\sum_{t=1}^{T_u} \log[m_x(t)]}{T_u} \quad \text{with } t_{T_u} = 1985, \qquad a_x^u = \mathrm{smooth}_x\{a_x^p\} \quad \text{with } a_{0-1}^u = a_{0-1}^p. \tag{13}$$


3. For each $x$ interpolate from $a_x^s$ to $a_x^u$, assuming that in the long run the excess mortality
due to the HIV/AIDS epidemic disappears (or reaches a very low endemic level with
negligible mortality impact) as a result of decreased HIV prevalence, improved
access to treatment and improved survival with treatment.

4. During the projections, pick an $a_x(\tau)$ by moving along the interpolated line of the
corresponding $x$, so that $a_x^u$ is reached by 2100.

5. As above, $b_x$ is associated with one of the regional model life tables.

Figure 7: The joint distribution of mortality rates for females and males for Kazakhstan in
2095-2100 for individual age groups. In all panels, 1000 points are shown and all axes are
on a normal scale.

An example of the resulting projected median age-specific mortality rates for Botswana,
a country with a generalized HIV/AIDS epidemic, is shown in Figure 8.

Figure 8: Projected age-specific mortality rates for Botswana, a country with a generalized
HIV/AIDS epidemic.

There has been recent progress in the modelling of age patterns of mortality for countries
with generalized HIV/AIDS (Sharrow et al., 2014). This could provide additional options to
better incorporate the uncertainty about future HIV prevalence, expanded access to treatment,
underlying age mortality patterns, and their interaction on overall mortality by age
into probabilistic population projections. Further calibration and validation of these models
using empirical estimates from cohort studies (Zaba et al., 2007; Reniers et al., 2014) will be
important in this context.


3 Age-Specific Fertility Rates for Probabilistic Population Projections

3.1 WPP 2012 Method of Projecting Age-Specific Fertility Rates 


The United Nations probabilistic population projections released in 2014 (Gerland et al.,
2014) used a set of projected age-specific fertility rates for each country obtained by combining
probabilistic projections of the total fertility rate with deterministic projections of age
patterns of fertility as used in the 2012 revision of the World Population Prospects (United
Nations, 2014).


For high-fertility and medium-fertility countries, future age patterns of fertility were obtained
by interpolating linearly between a starting proportionate age pattern of fertility and
a target model pattern. The target model pattern was chosen from among 15 proportionate
age patterns of fertility, with mean age at childbearing varying between 24 and 28.5 years.
The target pattern was held constant once the country reaches its lowest fertility level, or
by 2045-2050 onward.

Figure 9: Example of projected Mean Age of Childbearing (MAC) for countries in Eastern
Asia in WPP 2012.

For low fertility countries, a similar approach was used. This projected future age-specific 
fertility patterns by assuming that they would reach a target model pattern by 2025-2030. 
This target was chosen from among five target age patterns of fertility either for the market 
economies of Europe (with mean age of childbearing varying between 28 and 32 years) or 
for countries with economies in transition (with mean age of childbearing varying between 
26 and 30 years). Once the model pattern was reached, it was assumed to remain constant 
until the end of the projection period. In some instances, a modified Lee-Carter approach (Li
& Gerland, 2009) was used to extrapolate the most recent set of proportionate age-specific
fertility rates using the rates of change from country-specific historical trends.

All the trajectories making up the probabilistic projection of fertility for a given country 
used the same age pattern of fertility. The choice of target pattern of fertility for a given 
country, from among the set of model patterns considered, was driven by country-specific 
expert opinion about future trends and normative assumptions. No global or regional
convergence in age patterns of fertility was imposed.

Figure 9 shows the results of the projections for the Mean Age of Childbearing (MAC)
for countries in Eastern Asia from the 2012 Revision of the World Population Prospects.
Overall, the method for projecting age-specific patterns of fertility in the 2012 Revision (as
well as in previous revisions) has several limitations. First, no global or regional convergence
has been imposed despite the overall convergence in total fertility rates observed in the
projection period up to 2100. Second, the time point when the target age-specific pattern
is reached is not related to the projected total fertility rates. Third, expert assumptions on
the target age pattern and method used for individual countries introduce diversity in the
age-specific trends that is difficult to explain (see Figure 9: Mongolia and Democratic People’s
Republic of Korea were done by Analyst 1, all other countries by Analyst 2). Finally, since
the analysts have used at least two different methods and 25 target age patterns of fertility,
the documentation of the decisions made for individual countries has been challenging.


3.2 Convergence Method for Projecting Age-Specific Fertility Rates 


We now propose a new method for projecting age-specific fertility rates, to overcome some
of the limitations of the existing method used in WPP 2012. This builds on the approach
adopted in sets of projections of married or in-union women of reproductive age (MWRA)
(United Nations, 2013b). Beginning from the most recent observation of the age pattern of
fertility in the base period of projection, the projected age patterns of fertility are based on
the past national trend combined with the trend towards the global model age pattern of
fertility. The projection method is implemented on the proportionate age-specific fertility
rates (PASFR) covering seven age groups from 15-19 to 45-49. The final projection of
PASFRs for each age group is a weighted average of two preliminary projections:

(a) the first preliminary projection, assuming that the PASFRs converge to the global
model pattern, see Section 3.2.1; and

(b) the second preliminary projection, assuming the observed national trend in PASFRs
continues into the indefinite future, see Section 3.2.2.

The method is applied to all the trajectories that make up the probabilistic projection of
total fertility rate for all countries, based on the historical data in the WPP 2012 Revision
(Gerland et al., 2014).


We now define the preliminary projections that constitute our overall projection. We use
different notation than in Section 2, so the same symbol may be used to denote different
quantities in the two sections.

3.2.1 Trend towards the global model pattern

Let $t_r$ denote the base period of a projection and $t_g$ the year when the global model pattern
is reached. For $t_r < t < t_g$, the proportion of the interval $[t_r, t_g]$ that has elapsed at time $t$ is

$$\tau_t = (t - t_r)/(t_g - t_r).$$

Section 3.2.4 below gives details about how to estimate $t_g$.

Let $p_r$ denote PASFR at the base period $t_r$, and let $p_g$ denote PASFR of the global model
pattern (see footnote 6). The projection at time $t$ of PASFR towards the global model pattern, denoted
by $p_t^{I}$, is obtained by:

$$\mathrm{logit}(p_t^{I}) = \mathrm{logit}(p_r) + \tau_t\,[\mathrm{logit}(p_g) - \mathrm{logit}(p_r)] \tag{14}$$

Then $p_t^{I}$ is renormalized so that it sums to unity for all time periods $t$.
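
As a minimal sketch of Equation (14), the following Python snippet interpolates on the logit scale between a base-period pattern and a global model pattern and then renormalizes. The function names, the example PASFR vectors and the clipping of the elapsed proportion to [0, 1] outside the interval are illustrative assumptions; none of them is taken from the bayesPop package.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))

def pasfr_towards_global(p_r, p_g, t, t_r, t_g):
    """Eq. (14): move from p_r towards p_g linearly on the logit scale."""
    # Elapsed share of [t_r, t_g]; clipping outside the interval is an assumption.
    tau = min(max((t - t_r) / (t_g - t_r), 0.0), 1.0)
    raw = [inv_logit(logit(a) + tau * (logit(b) - logit(a)))
           for a, b in zip(p_r, p_g)]
    total = sum(raw)
    return [x / total for x in raw]  # renormalize so the shares sum to one

# Hypothetical PASFRs for age groups 15-19, ..., 45-49 (each sums to 1).
p_base = [0.20, 0.30, 0.22, 0.14, 0.09, 0.04, 0.01]
p_global = [0.03, 0.15, 0.32, 0.31, 0.15, 0.035, 0.005]
p_mid = pasfr_towards_global(p_base, p_global, t=2040, t_r=2010, t_g=2070)
```

At the base period the function returns the base pattern and at $t_g$ it returns the global pattern; in between, each age group's share moves monotonically on the logit scale before renormalization.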


3.2.2 Continuing of observed national trend 

Let $T$ denote the number of 5-year periods over which the model is fitted. Then $t_{r-T}$ is
the starting time period of the estimation and $p_{r-T}$ is PASFR at $t_{r-T}$. $p_t^{II}$ is the projected
PASFR at time $t$, assuming the past trend was to continue into the future under the following
rule:

$$\mathrm{logit}(p_t^{II}) = \mathrm{logit}(p_r) + \frac{t - t_r}{t_r - t_{r-T}}\,[\mathrm{logit}(p_r) - \mathrm{logit}(p_{r-T})] \tag{15}$$

As above, $p_t^{II}$ should be scaled to sum to unity for all $t$. Note that in our implementation
we use $T = 3$.
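
The trend rule of Equation (15) can be sketched as follows. The sketch assumes that the multiplier of the bracketed term is the elapsed time relative to the length of the fitting window, $(t - t_r)/(t_r - t_{r-T})$, that periods are identified by calendar year, and that each period is 5 years long. The function name and the example vectors are hypothetical.

```python
import math

def pasfr_national_trend(p_r, p_past, t, t_r, n_periods=3, period_len=5):
    """Eq. (15) sketch: extend the logit-scale change observed over the last
    n_periods 5-year periods (T = 3 in the paper) linearly to time t."""
    # Assumed multiplier: (t - t_r) / (t_r - t_{r-T}).
    step = (t - t_r) / (n_periods * period_len)
    raw = []
    for now, past in zip(p_r, p_past):
        lg_now = math.log(now / (1.0 - now))
        lg_past = math.log(past / (1.0 - past))
        raw.append(1.0 / (1.0 + math.exp(-(lg_now + step * (lg_now - lg_past)))))
    total = sum(raw)
    return [x / total for x in raw]  # rescale to sum to one

# Hypothetical PASFRs at t_{r-T} = 1995 and t_r = 2010.
p_1995 = [0.22, 0.31, 0.21, 0.13, 0.08, 0.04, 0.01]
p_2010 = [0.20, 0.30, 0.22, 0.14, 0.09, 0.04, 0.01]
p_2025 = pasfr_national_trend(p_2010, p_1995, t=2025, t_r=2010)
```

In this example the share of the youngest age group fell between 1995 and 2010, so the extrapolation lowers it further by 2025.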


3.2.3 Resulting projection 

Projected PASFR at time $t$, $p_t$, is calculated as:

$$\mathrm{logit}(p_t) = \tau_t \cdot \mathrm{logit}(p_t^{I}) + (1 - \tau_t)\,\mathrm{logit}(p_t^{II}) \tag{16}$$

The resulting $p_t$ is renormalized so that it sums to unity for all time periods $t$.
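
A short sketch of the weighting in Equation (16): given the two preliminary projections for one time period and the elapsed proportion of the convergence interval, the final PASFR is their weighted average on the logit scale. The function name and input vectors are illustrative assumptions.

```python
import math

def combine_pasfr(p_global_path, p_trend_path, tau):
    """Eq. (16): weight tau on the global-pattern projection and
    (1 - tau) on the national-trend projection, on the logit scale."""
    raw = []
    for a, b in zip(p_global_path, p_trend_path):
        mix = tau * math.log(a / (1.0 - a)) + (1.0 - tau) * math.log(b / (1.0 - b))
        raw.append(1.0 / (1.0 + math.exp(-mix)))
    total = sum(raw)
    return [x / total for x in raw]  # renormalize to sum to one

# Hypothetical preliminary projections for one time period.
p_first = [0.08, 0.22, 0.27, 0.22, 0.13, 0.07, 0.01]
p_second = [0.18, 0.29, 0.23, 0.15, 0.10, 0.04, 0.01]
p_final = combine_pasfr(p_first, p_second, tau=0.5)
```

Because the weight on the global-pattern projection equals the elapsed proportion, the national trend dominates shortly after the base period and its influence vanishes by the time the global pattern is reached.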

6 In the bayesPop package the global model pattern is created as an average of most recent PASFRs for a
set of countries (selected through a dummy variable in the vwBaseYear2012 dataset). For the purpose of the
current analysis, the low fertility countries selected have already reached their Phase III and represent later
childbearing patterns with mean age at childbearing close to or above 30 years in 2010-2015: Austria, the
Czech Republic, Denmark, France, Germany, Japan, the Netherlands, Norway and the Republic of Korea.
The specification of the countries used for the global model pattern can be changed in the input file.

Figure 10: Trends in Mean Age at Childbearing in countries with the start of Phase III of
fertility decline before 2000. Dots mark the time period when the country entered Phase III.


3.2.4 Estimating the time period of reaching global pattern 

We assume that the transition from the most recent age pattern of fertility to the global model
age pattern of fertility depends on when the total fertility rate (TFR) enters
Phase III, i.e. when the fertility transition is completed and the country reaches low fertility.
For the countries in Phase III, a time series model to project TFR was used that assumed
that in the long run fertility would approach and fluctuate around country-specific ultimate
fertility levels based on a Bayesian hierarchical model (Raftery et al., 2014). The time series
model uses the empirical evidence from low-fertility countries that have experienced fertility
increases from a sub-replacement level after a completed fertility transition. At the same
time, based on the empirical evidence on the postponement of childbearing in low-fertility
countries, profound shifts to a later start of childbearing and an increase in the mean age at
childbearing are still taking place several periods after the start of Phase III (see Figure 10).
The timing and speed of the postponement of childbearing in Phase III are country-specific,
and in this paper we implement the assumption that the transition to a later childbearing
pattern is completed when total fertility approaches the country-specific ultimate fertility level.

To be more specific, we assume that the time $t_g$ of completion of the transition to
a global model pattern corresponds to the time point $t_u$ when TFR reaches the ultimate
fertility level of that country. In probabilistic projections of TFR, we approximate the
ultimate fertility level, denoted by $f_u$, by the median TFR in the last projection period $t_e$,
e.g. $t_e$ = 2095-2100, if TFR is in Phase III:

$$f_u = \mathrm{median}_i\,[f_i(t_e)] \quad (i \text{ denotes trajectories}) \tag{17}$$

Then for each TFR trajectory, $t_u$ is the earliest time period at which the TFR is larger than or
equal to $f_u$:

$$t_u = \min\{t : f(t) \ge f_u \text{ and } t > t_{P3}\} \tag{18}$$

where $t_{P3}$ denotes the start of Phase III. For the estimation of $t_g$, we will now distinguish
two cases, depending on whether $t_{P3}$ is smaller or larger than the end period $t_e$.
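
Equations (17) and (18) can be illustrated with a small sketch. It assumes that each TFR trajectory is a list of values aligned with a shared list of period labels (here the start year of each 5-year period), and it leaves the selection of trajectories that are in Phase III to the caller. All names and numbers are hypothetical.

```python
from statistics import median

def ultimate_fertility_level(tfr_trajectories):
    """Eq. (17): median TFR across trajectories in the last projection period."""
    return median(traj[-1] for traj in tfr_trajectories)

def first_time_reaching(tfr, periods, f_u, t_p3):
    """Eq. (18): earliest period after t_p3 with TFR >= f_u, or None if
    the trajectory never reaches f_u (t_u does not exist)."""
    for t, f in zip(periods, tfr):
        if t > t_p3 and f >= f_u:
            return t
    return None

# Three hypothetical trajectories over three periods.
periods = [2015, 2020, 2025]
trajectories = [[1.5, 1.6, 1.8], [1.4, 1.7, 1.9], [1.3, 1.5, 1.7]]
f_u = ultimate_fertility_level(trajectories)
t_u = [first_time_reaching(tr, periods, f_u, t_p3=2010) for tr in trajectories]
```

The third trajectory stays below the median level throughout, so it has no $t_u$ and falls under the rule for a non-existent $t_u$.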

Case 1: $t_{P3} < t_e$

In this case, $t_{P3}$ is either observed ($t_{P3} \le t_r$) or projected within the projection period
($t_r < t_{P3} < t_e$). In both cases, if $t_u$ exists,

$$t_g = \max(t_u, t_r + 10). \tag{19}$$

This includes a situation where $f(t) \ge f_u$ for $t \le t_r$. In such a case, the global pattern is
reached quickly, namely in two 5-year periods.

If $f(t) < f_u$ for all $t_r \le t \le t_e$, then $t_u$ does not exist. In such a case, $t_g$ is set to the end
of the projection period, but at least five 5-year periods after $t_{P3}$:

$$t_g = \max(t_e, t_{P3} + 25) \tag{20}$$

Case 2: $t_e \le t_{P3}$

In this case, $t_{P3}$ is unknown, i.e. the TFR trajectory has not reached Phase III at $t_e$. Thus,
we will make an estimate of $t_{P3}$, denoted by $\hat{t}_{P3}$, and then simply apply

$$t_g = \hat{t}_{P3} + 25. \tag{21}$$

If the TFR at $t_e$ is low, namely $f(t_e) \le 1.8$, we assume that $\hat{t}_{P3} = t_e$. Otherwise, we
approximate $\hat{t}_{P3}$ by a linear extrapolation of TFR from the last four time periods and
determine when such a line reaches 1.8, with an upper limit of $\hat{t}_{P3} = t_e + 50$.
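
The case distinction of Equations (19) to (21) can be sketched as below. Several details are assumptions made for illustration only: periods are identified by a calendar year, the linear extrapolation is a least-squares line through the last four TFR values, a non-declining line is assigned the upper limit, and the result is not snapped to the 5-year grid. The function names are hypothetical.

```python
def estimate_t_p3(tfr_last4, t_e, period_len=5, level=1.8, max_delay=50):
    """Case 2 estimate of the start of Phase III for one trajectory."""
    if tfr_last4[-1] <= level:
        return float(t_e)
    xs = [-period_len * k for k in (3, 2, 1, 0)]  # years relative to t_e
    x_bar, y_bar = sum(xs) / 4.0, sum(tfr_last4) / 4.0
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, tfr_last4))
             / sum((x - x_bar) ** 2 for x in xs))
    fitted_at_end = y_bar - slope * x_bar  # value of the line at t_e
    cap = float(t_e + max_delay)
    if slope >= 0:  # assumption: a non-declining line gets the upper limit
        return cap
    t_hit = t_e + (level - fitted_at_end) / slope
    return min(max(t_hit, float(t_e)), cap)

def estimate_t_g(t_r, t_e, t_p3=None, t_u=None, tfr_last4=None):
    """Year in which the global model pattern is reached."""
    if t_p3 is not None and t_p3 < t_e:   # Case 1
        if t_u is not None:
            return max(t_u, t_r + 10)      # Eq. (19)
        return max(t_e, t_p3 + 25)         # Eq. (20)
    return estimate_t_p3(tfr_last4, t_e) + 25  # Case 2, Eq. (21)

# Hypothetical calls: Phase III already observed; Phase III not yet reached.
t_g_fast = estimate_t_g(t_r=2010, t_e=2095, t_p3=2000, t_u=2005)
t_g_late = estimate_t_g(t_r=2010, t_e=2095, tfr_last4=[2.6, 2.4, 2.2, 2.0])
```

In the second call the fitted line falls by 0.04 per year from 2.0 at $t_e$, so it reaches 1.8 five years later and the global pattern is reached 25 years after that.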

3.2.5 Exception for late childbearing pattern 

Since for some countries the MAC has already been observed to exceed, or (as projected by the
algorithm described above) will in the near future exceed, the MAC associated with the
global model pattern, we assume that, for a given country’s trajectory, once the maximum
MAC is reached in the convergence period, the associated PASFR pattern is kept constant
for the remaining projection periods. This assumption makes it possible to retain trajectory-specific
patterns of late childbearing for trajectories that have entered Phase III and thus already have low total
fertility (see Figure 11 for the example of the Czech Republic). Note that this rule is applied
only in Case 1 above.
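
One possible reading of this rule is sketched below: compute the MAC of each projected pattern as the PASFR-weighted mean of the age-group midpoints, locate the period with the highest MAC, and hold that pattern for all later periods. The use of age-group midpoints for the MAC and the freezing at the first maximum are assumptions of the sketch, and the patterns are hypothetical.

```python
AGE_MIDPOINTS = [17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5]

def mean_age_at_childbearing(pasfr):
    """MAC as the PASFR-weighted mean of 5-year age-group midpoints."""
    return sum(a * p for a, p in zip(AGE_MIDPOINTS, pasfr)) / sum(pasfr)

def hold_after_max_mac(pasfr_path):
    """Keep the pattern with the highest MAC for all later periods."""
    macs = [mean_age_at_childbearing(p) for p in pasfr_path]
    k = macs.index(max(macs))  # first period attaining the maximum
    kept = [list(p) for p in pasfr_path[:k + 1]]
    return kept + [list(pasfr_path[k]) for _ in pasfr_path[k + 1:]]

# Hypothetical path: MAC rises above the global pattern's MAC, then would fall.
early = [0.20, 0.30, 0.22, 0.14, 0.09, 0.04, 0.01]
late = [0.02, 0.10, 0.28, 0.34, 0.20, 0.05, 0.01]
global_like = [0.03, 0.15, 0.32, 0.31, 0.15, 0.035, 0.005]
adjusted = hold_after_max_mac([early, late, global_like])
```

In this example the second pattern has the highest MAC, so the third period repeats it instead of moving back towards the younger global pattern.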


3.3 Results of the convergence method applied to probabilistic projections


For the 2012 Revision, age-specific fertility estimates are based on empirical data for all
countries of the world for the period up to 2010 (or up to 2010-2015 for 37 countries with
empirical data up to 2011 or 2012; Gerland et al., 2014). Using the probabilistic projections
of TFR, each TFR trajectory has a specific start of Phase III and therefore the timing of
convergence to the global model pattern is trajectory-specific. This yields a set of trajectories
of PASFR (although not probabilistic) which in turn, when combined with the probabilistic
TFR, yields a probabilistic projection of age-specific fertility rates.
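
The combination step can be illustrated with a brief sketch. With 5-year age groups the TFR equals five times the sum of the age-specific rates, so a PASFR vector that sums to one converts a TFR value into age-specific fertility rates by multiplying and dividing by the age-group width. The function name and the paired trajectories below are hypothetical.

```python
def asfr_from_tfr(tfr, pasfr, width=5):
    """Age-specific fertility rates (births per woman per year) implied by
    a TFR value and a PASFR vector that sums to one."""
    return [tfr * p / width for p in pasfr]

# Hypothetical paired trajectories for one country and one time period.
tfr_draws = [1.6, 1.9, 2.3]
pasfr_draws = [
    [0.03, 0.15, 0.32, 0.31, 0.15, 0.035, 0.005],
    [0.05, 0.20, 0.30, 0.27, 0.14, 0.035, 0.005],
    [0.10, 0.26, 0.28, 0.20, 0.11, 0.04, 0.01],
]
asfr_draws = [asfr_from_tfr(f, p) for f, p in zip(tfr_draws, pasfr_draws)]
```

Each resulting vector reproduces its TFR when summed and multiplied by five, so the spread across the set reflects the TFR uncertainty together with the trajectory-specific age pattern.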


Figure 11 shows an example of the results for PASFR in Niger, Bangladesh and the Czech
Republic for selected age groups over time. Figure 12 shows an example of the probabilistic
results of age-specific fertility rates for Ethiopia, Nepal and Japan at the end of the projection
period in 2095-2100.

Figure 13 shows the development of PASFR for Uganda, India and Germany over time
from 2005-2010 to 2095-2100. Here, the methodology was applied to the deterministic projection
of TFR from WPP 2012.

In Figure 9 we showed projections of MAC from WPP 2012. This can be compared to
Figure 14 where the same measure is shown after applying the new methodology to the TFR
of WPP 2012.


Overall, the new method we propose improves on the current methodology in several 
ways. First, in the very long term (after 2100) the age patterns of fertility are converging 
to one global pattern, while retaining specific late childbearing patterns for several countries 
that reach such patterns in the current period or in the near future. Second, the projections 
of the age pattern of fertility are now linked to projections of the total fertility rate. Finally, 
for each probabilistic trajectory, the time when the target age pattern is reached depends on 
the trajectory-specific total fertility rate. 


4 Discussion 

We have described the methods used for converting projected life expectancies at birth and
total fertility rates to age-specific mortality and fertility rates in the UN’s 2014 probabilistic
population projections. We have identified some limitations of these methods and have
proposed several improvements to overcome them. These include a new coherent Kannisto
method to avoid crossovers in mortality rates between the sexes at very high ages. They also
include the application of a coherent Lee-Carter method, methods for avoiding jump-off bias,
and a rotated Lee-Carter method to reflect the fact that at high life expectancies, mortality
rates tend to decline faster at higher than at lower ages.

Figure 11: Proportionate age-specific fertility rates (PASFR) by time for age groups 15-19,
25-29 and 35-39 in Niger, Bangladesh and the Czech Republic. Projected median of PASFR
(red line) approaches global model pattern of PASFR (black dashed line). The solid grey
lines are trajectories that correspond to different starting periods for Phase III; they do not
represent random samples from a predictive probability distribution.

It should be noted that the 2014 PPP takes account of uncertainty about the overall
level of fertility as measured by the TFR, and also about the overall level of mortality
as measured by $e_0$. Conditional on TFR and $e_0$, however, the projected vital rates are
deterministic. There is thus a missing component of uncertainty, and it would be desirable
to extend the methods used to take account of this, particularly of uncertainty about the
future mean age at childbearing (Ediev, 2013

Figure 12: Probabilistic projection of age-specific fertility rates for Ethiopia (left panel),
Nepal (middle panel) and Japan (right panel) in the time period 2095-2100. The marginal
distribution of age-specific fertility rates (red lines, with dashed lines marking the 80%
probability intervals; solid grey lines are randomly sampled trajectories) is compared
to age-specific fertility rates in the time period 2005-2010 (blue line) and to the global model
pattern applied to the median projection of total fertility for the world in 2095-2100 (black
dashed line).

Figure 13: PASFR by age over time for selected countries.

Figure 14: Example of projected MAC for countries in Eastern Asia after applying the
proposed methodology.